168 research outputs found

    Methods for Interpreting and Understanding Deep Neural Networks

    This paper provides an entry point to the problem of interpreting a deep neural network model and explaining its predictions. It is based on a tutorial given at ICASSP 2017. It introduces some recently proposed techniques of interpretation, along with theory, tricks, and recommendations, to make the most efficient use of these techniques on real data. It also discusses a number of practical applications. Comment: 14 pages, 10 figures
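    One family of interpretation techniques of this kind is gradient-based sensitivity analysis. As a purely illustrative sketch (the model and input below are placeholders, not taken from the paper), it can be written in a few lines of PyTorch:

        import torch
        import torch.nn as nn

        # Placeholder classifier; any differentiable model can be analysed the same way.
        model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
        model.eval()

        x = torch.rand(1, 784, requires_grad=True)   # dummy input, e.g. a flattened image

        # Sensitivity analysis: the relevance of each input feature is the squared
        # partial derivative of the winning class score with respect to that feature.
        scores = model(x)
        target = scores.argmax(dim=1).item()
        scores[0, target].backward()
        relevance = x.grad.pow(2).squeeze(0)

        print(relevance.topk(5).indices)             # five most sensitive input features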

    Learning Sparse & Ternary Neural Networks with Entropy-Constrained Trained Ternarization (EC2T)

    Deep neural networks (DNNs) have shown remarkable success in a variety of machine learning applications. The capacity of these models (i.e., their number of parameters) endows them with expressive power and allows them to reach the desired performance. In recent years, there has been increasing interest in deploying DNNs on resource-constrained devices (e.g., mobile devices) with limited energy, memory, and computational budget. To address this problem, we propose Entropy-Constrained Trained Ternarization (EC2T), a general framework to create sparse and ternary neural networks which are efficient in terms of storage (e.g., at most two binary masks and two full-precision values are required to save a weight matrix) and computation (e.g., MAC operations are reduced to a few accumulations plus two multiplications). This approach consists of two steps. First, a super-network is created by scaling the dimensions of a pre-trained model (i.e., its width and depth). Subsequently, this super-network is simultaneously pruned (using an entropy constraint) and quantized (that is, ternary values are assigned layer-wise) in a training process, resulting in a sparse and ternary network representation. We validate the proposed approach on the CIFAR-10, CIFAR-100, and ImageNet datasets, showing its effectiveness in image classification tasks. Comment: Proceedings of the CVPR'20 Joint Workshop on Efficient Deep Learning in Computer Vision. Code is available at https://github.com/d-becking/efficientCNN
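    As a rough illustration of the storage and computation argument above (not the EC2T training procedure itself), a sparse ternary weight matrix can be stored as two binary masks plus two scalars, and a matrix-vector product then reduces to accumulations followed by two multiplications per output. The values below are made up for the sketch:

        import numpy as np

        # Two full-precision values and two binary masks are enough to store a
        # sparse ternary weight matrix (illustrative values, not trained weights).
        w_p, w_n = 0.37, 0.41
        W_ternary = np.random.choice([-1, 0, 1], size=(64, 128), p=[0.2, 0.6, 0.2])
        mask_pos = (W_ternary == 1)
        mask_neg = (W_ternary == -1)

        # A matrix-vector product reduces to two accumulations per output neuron
        # followed by two multiplications with the shared scalars.
        x = np.random.randn(128)
        acc_pos = mask_pos.astype(x.dtype) @ x   # sum of inputs hitting +1 weights
        acc_neg = mask_neg.astype(x.dtype) @ x   # sum of inputs hitting -1 weights
        y = w_p * acc_pos - w_n * acc_neg

        # Sanity check against the dense reconstruction of the ternary matrix.
        W_dense = w_p * mask_pos - w_n * mask_neg
        assert np.allclose(y, W_dense @ x)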

    Explaining Recurrent Neural Network Predictions in Sentiment Analysis

    Recently, a technique called Layer-wise Relevance Propagation (LRP) was shown to deliver insightful explanations, in the form of input-space relevances, for understanding feed-forward neural network classification decisions. In the present work, we extend the usage of LRP to recurrent neural networks. We propose a specific propagation rule applicable to multiplicative connections as they arise in recurrent architectures such as LSTMs and GRUs. We apply our technique to a word-based bi-directional LSTM model on a five-class sentiment prediction task, and evaluate the resulting LRP relevances both qualitatively and quantitatively, obtaining better results than a related gradient-based method used in previous work. Comment: 9 pages, 4 figures, accepted for the EMNLP'17 Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis (WASSA)
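    For context, the layer-wise redistribution that such propagation rules build on can be sketched for a single dense layer with the generic epsilon-stabilised LRP rule. This is an illustrative toy example, not the paper's LSTM-specific rule for multiplicative connections:

        import numpy as np

        def lrp_epsilon(a, W, b, R_out, eps=0.01):
            # Redistribute the relevance R_out of a dense layer z = W @ a + b
            # back onto its inputs a, using the epsilon-stabilised LRP rule.
            z = W @ a + b
            s = R_out / (z + eps * np.where(z >= 0, 1.0, -1.0))
            return a * (W.T @ s)             # relevance share of each input

        # Toy layer: 3 inputs, 2 outputs; start the backward pass from the outputs.
        a = np.array([0.5, -1.0, 2.0])
        W = np.random.randn(2, 3)
        b = np.zeros(2)
        R_out = W @ a + b
        print(lrp_epsilon(a, W, b, R_out))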

    Bringing BCI into everyday life: Motor imagery in a pseudo realistic environment

    Bringing Brain-Computer Interfaces (BCIs) into everyday life is a challenge because an out-of-lab environment implies the presence of variables that are largely beyond the control of the user and the software application. This can severely corrupt signal quality as well as the reliability of BCI control. Current BCI technology may fail in this application scenario because of the large amounts of noise, nonstationarity, and movement artifacts. In this paper, we systematically investigate the performance of motor imagery BCI in a pseudo-realistic environment. In our study, 16 participants were asked to perform motor imagery tasks while dealing with different types of distractions, such as vibratory stimulation or listening tasks. Our experiments demonstrate that standard BCI procedures are not robust to these additional sources of noise, indicating that methods which work well in a lab environment may perform poorly in realistic application scenarios. We discuss several promising research directions to tackle this important problem. Funding: BMBF, 01GQ1115, Adaptive Gehirn-Computer-Schnittstellen (BCI) in nichtstationären Umgebungen (Adaptive Brain-Computer Interfaces in Non-Stationary Environments)
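    For readers unfamiliar with the standard BCI procedures referred to above, a typical lab baseline classifies band-power-style features of each trial with a linear discriminant. The sketch below uses synthetic features and is purely illustrative; it is not the study's actual processing chain:

        import numpy as np
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

        # Synthetic stand-in for band-power features of motor imagery trials
        # (two classes, e.g. left- vs. right-hand imagery); illustrative only.
        rng = np.random.default_rng(0)
        X = rng.normal(size=(100, 6))
        y = rng.integers(0, 2, size=100)
        X[y == 1, :3] += 0.8                 # make the two classes separable

        clf = LinearDiscriminantAnalysis().fit(X[:80], y[:80])
        print("held-out accuracy:", clf.score(X[80:], y[80:]))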